critical review
Cost-Aware Prediction (CAP): An LLM-Enhanced Machine Learning Pipeline and Decision Support System for Heart Failure Mortality Prediction
Yu, Yinan, Dippel, Falk, Lundberg, Christina E., Lindgren, Martin, Rosengren, Annika, Adiels, Martin, Sjöland, Helen
Objective: Machine learning (ML) predictive models are often developed without considering downstream value trade-offs and clinical interpretability. This paper introduces a cost-aware prediction (CAP) framework that combines cost-benefit analysis assisted by large language model (LLM) agents to communicate the trade-offs involved in applying ML predictions. Materials and Methods: We developed an ML model predicting 1-year mortality in patients with heart failure (N = 30,021, 22% mortality) to identify those eligible for home care. We then introduced clinical impact projection (CIP) curves to visualize important cost dimensions (quality of life and healthcare provider expenses, further divided into treatment and error costs) to assess the clinical consequences of predictions. Finally, we used four LLM agents to generate patient-specific descriptions. The system was evaluated by clinicians for its decision support value. Results: The eXtreme gradient boosting (XGB) model achieved the best performance, with an area under the receiver operating characteristic curve (AUROC) of 0.804 (95% confidence interval (CI) 0.792-0.816), area under the precision-recall curve (AUPRC) of 0.529 (95% CI 0.502-0.558) and a Brier score of 0.135 (95% CI 0.130-0.140). Discussion: The CIP cost curves provided a population-level overview of cost composition across decision thresholds, whereas the LLM agents generated cost-benefit analyses at the individual patient level. The system was well received according to the evaluation by clinicians. However, feedback emphasizes the need to strengthen the technical accuracy for speculative tasks. Conclusion: CAP utilizes LLM agents to integrate ML classifier outcomes and cost-benefit analysis for more transparent and interpretable decision support.
- Europe > Sweden > Västra Götaland > Gothenburg (0.05)
- North America > United States > Florida > Miami-Dade County > Miami (0.04)
- Europe > Sweden > Västerbotten County > Umeå (0.04)
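The CAP abstract above reports a Brier score and describes cost curves evaluated across decision thresholds. The sketch below illustrates both ideas in plain Python. It is not the paper's CIP definition: the function names `brier_score` and `cost_curve`, and the per-error costs `cost_fp` and `cost_fn`, are hypothetical and chosen only for illustration.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome (0 or 1)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def cost_curve(y_true, y_prob, thresholds, cost_fp, cost_fn):
    """Total error cost at each decision threshold.

    A patient is flagged positive when the predicted probability is >= threshold.
    cost_fp and cost_fn are assumed, illustrative costs per false positive and
    per false negative; they are not taken from the paper.
    """
    curve = []
    for t in thresholds:
        # False positives: flagged, but the outcome did not occur.
        fp = sum(1 for y, p in zip(y_true, y_prob) if p >= t and y == 0)
        # False negatives: not flagged, but the outcome occurred.
        fn = sum(1 for y, p in zip(y_true, y_prob) if p < t and y == 1)
        curve.append((t, fp * cost_fp + fn * cost_fn))
    return curve
```

Sweeping the threshold over a grid and plotting the second element of each pair gives a population-level view of how error cost shifts with the decision threshold, which is the general idea a cost curve conveys.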
Multimodal Reasoning for Science: Technical Report and 1st Place Solution to the ICML 2025 SeePhys Challenge
Liang, Hao, Wu, Ruitao, Zeng, Bohan, Niu, Junbo, Zhang, Wentao, Dong, Bin
Multimodal reasoning remains a fundamental challenge in artificial intelligence. Despite substantial advances in text-based reasoning, even state-of-the-art models such as GPT-o3 struggle to maintain strong performance in multimodal scenarios. To address this gap, we introduce a caption-assisted reasoning framework that effectively bridges visual and textual modalities. Our approach achieved 1st place in the ICML 2025 AI for Math Workshop & Challenge 2: SeePhys, highlighting its effectiveness and robustness. Furthermore, we validate its generalization on the MathVerse benchmark for geometric reasoning, demonstrating the versatility of our method. Our code is publicly available at https://github.com/OpenDCAI/SciReasoner.
Trojans in Large Language Models of Code: A Critical Review through a Trigger-Based Taxonomy
Hussain, Aftab, Rabin, Md Rafiqul Islam, Ahmed, Toufique, Xu, Bowen, Devanbu, Premkumar, Alipour, Mohammad Amin
Large language models (LLMs) have provided many exciting new capabilities in software development. However, the opaque nature of these models makes them difficult to reason about and inspect. Their opacity gives rise to potential security risks, as adversaries can train and deploy compromised models to disrupt the software development process in the victims' organization. This work presents an overview of the current state-of-the-art trojan attacks on large language models of code, with a focus on triggers -- the main design point of trojans -- with the aid of a novel unifying trigger taxonomy framework. We also aim to provide a uniform definition of the fundamental concepts in the area of trojans in Code LLMs. Finally, we draw implications for trigger design from findings on how code models learn.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > New York > New York County > New York City (0.05)
- North America > Canada > Ontario > Toronto (0.04)
- Research Report (0.82)
- Overview (0.68)
Beyond Accuracy: A Critical Review of Fairness in Machine Learning for Mobile and Wearable Computing
Yfantidou, Sofia, Constantinides, Marios, Spathis, Dimitris, Vakali, Athena, Quercia, Daniele, Kawsar, Fahim
The field of mobile and wearable computing is undergoing a revolutionary integration of machine learning. Devices can now diagnose diseases, predict heart irregularities, and unlock the full potential of human cognition. However, the underlying algorithms powering these predictions are not immune to biases with respect to sensitive attributes (e.g., gender, race), leading to discriminatory outcomes. The goal of this work is to explore the extent to which the mobile and wearable computing community has adopted ways of reporting information about datasets and models to surface and, eventually, counter biases. Our systematic review of papers published in the Proceedings of the ACM Interactive, Mobile, Wearable and Ubiquitous Technologies (IMWUT) journal from 2018 to 2022 indicates that, while there has been progress made on algorithmic fairness, there is still ample room for growth. Our findings show that only a small portion (5%) of published papers adheres to modern fairness reporting, while the overwhelming majority thereof focuses on accuracy or error metrics. To generalize these results across venues of similar scope, we analyzed recent proceedings of ACM MobiCom, MobiSys, and SenSys, IEEE Pervasive, and IEEE Transactions on Mobile Computing, and found no deviation from our primary result. In light of these findings, our work provides practical guidelines for the design and development of mobile and wearable technologies that not only strive for accuracy but also fairness.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)
- Europe > Greece > Central Macedonia > Thessaloniki (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Law (1.00)
- Information Technology (1.00)
- Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
A critical review of the EU's 'Ethics Guidelines for Trustworthy AI'
Europe has some of the most progressive, human-centric artificial intelligence governance policies in the world. Compared to the heavy-handed government oversight in China or the Wild West-style, anything-goes approach in the US, the EU's strategy is designed to stoke academic and corporate innovation while also protecting private citizens from harm and overreach. In 2018, the European Commission began its European AI Alliance initiative. The alliance exists so that various stakeholders can weigh in and be heard as the EU considers its ongoing policies governing the development and deployment of AI technologies. Since 2018, more than 6,000 stakeholders have participated in the dialogue through various venues, including online forums and in-person events.
- Europe (0.55)
- Asia > China (0.25)
- North America > United States > California (0.05)
Why are we failing at the ethics of AI? A critical review
Anja Kaspersen and Wendell Wallach are senior fellows at Carnegie Council for Ethics in International Affairs. In November 2021, they published an article that changed the AI ethics conversation: Why Are We Failing at the Ethics of AI? Six months later, the questions the article raised are no closer to resolution. This article was a don't-hold-your-punches review on the state of AI ethics, with which I am in almost complete agreement. If we want to advance the AI conversation, this is still a good place to start. I've quoted a portion of their article, with my comments interspersed: While it is clear that AI systems offer opportunities across various areas of life, what amounts to a responsible perspective on their ethics and governance is yet to be realized.
- North America > United States (0.05)
- Europe > United Kingdom > England (0.05)
A Critical Review of Information Bottleneck Theory and its Applications to Deep Learning
In the past decade, deep neural networks have seen unparalleled improvements that continue to impact every aspect of today's society. With the development of high-performance GPUs and the availability of vast amounts of data, learning capabilities of ML systems have skyrocketed, going from classifying digits in a picture to beating world champions in games with super-human performance. However, even as ML models continue to achieve new frontiers, their practical success has been hindered by the lack of a deep theoretical understanding of their inner workings. Fortunately, a known information-theoretic method called the information bottleneck (IB) theory has emerged as a promising approach to better understand the learning dynamics of neural networks. In principle, IB theory models learning as a trade-off between the compression of the data and the retention of information. The goal of this survey is to provide a comprehensive review of IB theory covering its information-theoretic roots and the recently proposed applications to understand deep learning models.
- Overview (0.87)
- Research Report (0.69)
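The compression-versus-retention trade-off mentioned in the information bottleneck abstract above is commonly written as a Lagrangian over a representation T of the input X with target Y. The form below is the standard textbook statement, given as an assumption about what the survey covers; the abstract itself does not state a formula, and the symbols T and β do not appear in it.

```latex
% Information bottleneck objective (standard form; notation assumed, not from the abstract).
% X: input, Y: target, T: learned representation, with the Markov chain Y -> X -> T.
% I(.;.) is mutual information; beta >= 0 weights retained information against compression.
\min_{p(t \mid x)} \; I(X;T) \;-\; \beta \, I(T;Y)
```

Small values of β favor compression (low I(X;T)), while large values favor keeping information about the target (high I(T;Y)).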
A critical review on computer vision and artificial intelligence in food industry
Food demand and sustainability to feed the growing population are explained clearly. Technological innovations, including the Industry 4.0 revolution, strengthen the agricultural sector. The usage of computer vision and artificial intelligence in the field of agriculture and food industry is deeply elaborated. Emerging technologies such as computer vision and Artificial Intelligence (AI) are estimated to leverage the accessibility of big data for active training and yielding operational real-time smart machines and predictable models. This phenomenon of applying vision and learning methods for the improvement of the food industry is termed the computer vision and AI-driven food industry.
A critical review of Star Wars AI
This article has spoilers for just about the entire Star Wars universe. When it comes to fictional portrayals of artificial intelligence technology, the Star Trek universe stands head and shoulders above all others. Series creator Gene Roddenberry's vision for the far future seems just as prescient today, in the era of advanced deep learning, as it did in the 1960s when he unveiled it. Unfortunately, this article is about the AI in Star Wars. Before I go off the rails, I should point out that I'm a light saber-wielding Star Wars fanatic.
- Media > Film (1.00)
- Leisure & Entertainment (1.00)